Overview

Dataset statistics

Number of variables: 12
Number of observations: 1315
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 123.4 KiB
Average record size in memory: 96.1 B
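
The average record size can be checked from the other two figures: total memory divided by the number of observations. A minimal sketch, assuming 1 KiB = 1024 bytes; the variable names are illustrative only.

```python
# Values copied from the dataset statistics above.
total_kib = 123.4
n_observations = 1315

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # 96.1 B
```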

Variable types

NUM: 11
CAT: 1

Warnings

zipcode is highly correlated with schooldist and 1 other field (High correlation)
schooldist is highly correlated with zipcode (High correlation)
council is highly correlated with zipcode (High correlation)
lotarea is highly skewed (γ1 = 27.08598631) (Skewed)
df_index has unique values (Unique)
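
The γ1 in the skewness warning is the skewness coefficient. A minimal sketch of the moment-based form, g1 = m3 / m2^(3/2), assuming the population (biased) estimator; the report's exact estimator may apply a small-sample correction, so its values can differ slightly. The sample data is made up for illustration.

```python
def skewness(values):
    """Population (biased) moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
long_right_tail = [1, 2, 3, 4, 500]  # hypothetical data with one huge value

print(skewness(symmetric))        # 0.0
print(skewness(long_right_tail))  # positive: the tail points right
```

A single extreme value is enough to push the coefficient well above zero, which is the pattern the lotarea warning points at.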

Reproduction

Analysis started: 2021-05-29 22:19:56.939270
Analysis finished: 2021-05-29 22:20:12.532016
Duration: 15.59 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
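
The duration is the difference between the two timestamps. A short check using only the standard library:

```python
from datetime import datetime

started = datetime.fromisoformat("2021-05-29 22:19:56.939270")
finished = datetime.fromisoformat("2021-05-29 22:20:12.532016")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 15.59 seconds
```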

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 1315
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 413750.4935
Minimum: 7
Maximum: 858669
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.3 KiB

Quantile statistics

Minimum: 7
5-th percentile: 16463.2
Q1: 164320.5
Median: 269590
Q3: 681588
95-th percentile: 797185.6
Maximum: 858669
Range: 858662
Interquartile range (IQR): 517267.5
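
Range and IQR follow from the other rows: maximum minus minimum, and Q3 minus Q1. Fractional quantiles such as Q1 = 164320.5 suggest interpolation between neighbouring sorted values. A minimal sketch, assuming linear interpolation (the default in NumPy and pandas); the helper name and sample data are illustrative.

```python
def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)      # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

data = [7, 21, 32, 33, 394, 2875]     # illustrative values only
q1, q3 = quantile(data, 0.25), quantile(data, 0.75)
print(q1, q3, q3 - q1)                # IQR = Q3 - Q1

# The reported figures are consistent with the same definitions.
print(681588 - 164320.5)              # 517267.5 (IQR)
print(858669 - 7)                     # 858662 (range)
```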

Descriptive statistics

Standard deviation: 280744.4121
Coefficient of variation (CV): 0.678535534
Kurtosis: -1.637268931
Mean: 413750.4935
Median Absolute Deviation (MAD): 260956
Skewness: 0.1017818988
Sum: 544081899
Variance: 7.88174249e+10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
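
With fixed-size bins, every bin spans (maximum - minimum) / bins; for df_index that is 858662 / 50 ≈ 17173.24. The sketch below shows equal-width binning in plain Python. The function name and sample values are illustrative, and the report's own plotting code is not shown in the source.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width bins spanning [min, max].

    Assumes the data contains at least two distinct values.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum falls on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width

counts, width = fixed_width_histogram([7, 21, 32, 33, 394, 858669], bins=50)
print(width)        # (858669 - 7) / 50
print(sum(counts))  # every value lands in exactly one bin
```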
Common values
Value                Count  Frequency (%)
602111               1      0.1%
2875                 1      0.1%
465596               1      0.1%
275137               1      0.1%
797378               1      0.1%
160451               1      0.1%
746561               1      0.1%
201416               1      0.1%
269001               1      0.1%
267963               1      0.1%
Other values (1305)  1305   99.2%

Minimum 5 values
Value   Count  Frequency (%)
7       1      0.1%
21      1      0.1%
32      1      0.1%
33      1      0.1%
394     1      0.1%

Maximum 5 values
Value   Count  Frequency (%)
858669  1      0.1%
855445  1      0.1%
853929  1      0.1%
853648  1      0.1%
853646  1      0.1%

borough
Categorical

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 10.3 KiB
Value  Count  Frequency (%)
MN     1141   86.8%
BK     95     7.2%
QN     45     3.4%
BX     33     2.5%
SI     1      0.1%
Frequencies of value counts
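
Each frequency is the category count divided by the 1315 observations. The sketch below rebuilds a stand-in borough column from the reported counts and recomputes the percentages; the variable names are illustrative.

```python
from collections import Counter

# Rebuild the borough column from the counts reported above.
counts = {"MN": 1141, "BK": 95, "QN": 45, "BX": 33, "SI": 1}
boroughs = [code for code, n in counts.items() for _ in range(n)]

freq = Counter(boroughs)
total = len(boroughs)
for code, n in freq.most_common():
    print(f"{code}  {n:>5}  {n / total:.1%}")
```

The single SI row is also why the Unique count below is 1: it is the only category that occurs exactly once.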

Unique

Unique: 1
Unique (%): 0.1%
Histogram of lengths of the category

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

block
Real number (ℝ≥0)

Distinct: 735
Distinct (%): 55.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1198.855513
Minimum: 4
Maximum: 15638
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.3 KiB

Quantile statistics

Minimum: 4
5-th percentile: 26.7
Q1: 779
Median: 1104
Q3: 1363
95-th percentile: 2443
Maximum: 15638
Range: 15634
Interquartile range (IQR): 584

Descriptive statistics

Standard deviation: 1198.05812
Coefficient of variation (CV): 0.9993348713
Kurtosis: 42.90812459
Mean: 1198.855513
Median Absolute Deviation (MAD): 298
Skewness: 5.15274432
Sum: 1576495
Variance: 1435343.259
Monotonicity: Not monotonic
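
Two of these rows are derived from others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A quick check against the block figures (variable names are illustrative):

```python
import math

# Values copied from the descriptive statistics for block.
mean = 1198.855513
std = 1198.05812
variance = 1435343.259

cv = std / mean                      # coefficient of variation
print(round(cv, 6))                  # 0.999335
print(math.isclose(std ** 2, variance, rel_tol=1e-6))  # True
```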
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
16                  20     1.5%
1171                11     0.8%
763                 10     0.8%
1118                9      0.7%
1269                8      0.6%
21                  8      0.6%
993                 7      0.5%
1295                7      0.5%
1158                7      0.5%
762                 6      0.5%
Other values (725)  1222   92.9%

Minimum 5 values
Value  Count  Frequency (%)
4      1      0.1%
5      1      0.1%
6      2      0.2%
8      1      0.1%
9      3      0.2%

Maximum 5 values
Value  Count  Frequency (%)
15638  1      0.1%
15610  1      0.1%
10101  1      0.1%
9998   1      0.1%
7459   1      0.1%

schooldist
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 27
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.240304183
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2
Median: 2
Q3: 2
95-th percentile: 20
Maximum: 31
Range: 30
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 6.030620993
Coefficient of variation (CV): 1.422214241
Kurtosis: 8.788644772
Mean: 4.240304183
Median Absolute Deviation (MAD): 0
Skewness: 3.052984874
Sum: 5576
Variance: 36.36838956
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
Common values
Value              Count  Frequency (%)
2                  988    75.1%
3                  98     7.5%
13                 38     2.9%
30                 30     2.3%
1                  21     1.6%
14                 17     1.3%
5                  16     1.2%
15                 14     1.1%
21                 14     1.1%
6                  10     0.8%
Other values (17)  69     5.2%

Minimum 5 values
Value  Count  Frequency (%)
1      21     1.6%
2      988    75.1%
3      98     7.5%
4      8      0.6%
5      16     1.2%

Maximum 5 values
Value  Count  Frequency (%)
31     1      0.1%
30     30     2.3%
28     9      0.7%
27     2      0.2%
25     3      0.2%
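
The zero IQR and zero MAD reported for schooldist follow from these counts: 988 of 1315 values (75.1%) equal 2, so the median and both quartiles are 2 and more than half of the absolute deviations from the median are 0. A minimal sketch using only the ten listed values; the 69 "other" observations are omitted, which is an acknowledged simplification that does not change the result.

```python
from statistics import median

# Counts for the ten most common schooldist values listed above.
# The 69 "other" observations are omitted; they cannot change the result
# because 988 values of 2 already exceed half of all 1315 rows.
common = {2: 988, 3: 98, 13: 38, 30: 30, 1: 21,
          14: 17, 5: 16, 15: 14, 21: 14, 6: 10}
values = [v for v, n in common.items() for _ in range(n)]

med = median(values)
mad = median(abs(v - med) for v in values)  # median absolute deviation
print(med, mad)
```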
 

council
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 39
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.051711027
Minimum: 1
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 4
Q3: 5
95-th percentile: 33
Maximum: 50
Range: 49
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 9.591047467
Coefficient of variation (CV): 1.36010217
Kurtosis: 5.622520047
Mean: 7.051711027
Median Absolute Deviation (MAD): 1
Skewness: 2.564630811
Sum: 9273
Variance: 91.98819151
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)
Common values
Value              Count  Frequency (%)
4                  443    33.7%
3                  219    16.7%
1                  178    13.5%
5                  112    8.5%
6                  83     6.3%
2                  65     4.9%
33                 51     3.9%
26                 29     2.2%
35                 17     1.3%
7                  14     1.1%
Other values (29)  104    7.9%

Minimum 5 values
Value  Count  Frequency (%)
1      178    13.5%
2      65     4.9%
3      219    16.7%
4      443    33.7%
5      112    8.5%

Maximum 5 values
Value  Count  Frequency (%)
50     1      0.1%
48     9      0.7%
47     5      0.4%
43     1      0.1%
42     1      0.1%

zipcode
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 100
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10167.04715
Minimum: 10001
Maximum: 11691
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.3 KiB

Quantile statistics

Minimum: 10001
5-th percentile: 10001
Q1: 10016
Median: 10022
Q3: 10038
95-th percentile: 11212
Maximum: 11691
Range: 1690
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 371.1828445
Coefficient of variation (CV): 0.03650842167
Kurtosis: 4.173415413
Mean: 10167.04715
Median Absolute Deviation (MAD): 10
Skewness: 2.417939814
Sum: 13369667
Variance: 137776.704
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
10022              108    8.2%
10017              95     7.2%
10019              90     6.8%
10016              79     6.0%
10018              78     5.9%
10036              71     5.4%
10001              69     5.2%
10023              58     4.4%
10128              39     3.0%
10028              38     2.9%
Other values (90)  590    44.9%

Minimum 5 values
Value  Count  Frequency (%)
10001  69     5.2%
10002  16     1.2%
10003  15     1.1%
10004  28     2.1%
10005  29     2.2%

Maximum 5 values
Value  Count  Frequency (%)
11691  2      0.2%
11435  1      0.1%
11433  1      0.1%
11415  1      0.1%
11379  1      0.1%

landuse
Real number (ℝ≥0)

Distinct: 7
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.219011407
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 4
Median: 4
Q3: 5
95-th percentile: 5
Maximum: 8
Range: 7
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9464041998
Coefficient of variation (CV): 0.2243189479
Kurtosis: 3.027967883
Mean: 4.219011407
Median Absolute Deviation (MAD): 1
Skewness: 0.9377365481
Sum: 5548
Variance: 0.8956809093
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Common values
Value  Count  Frequency (%)
5      494    37.6%
4      482    36.7%
3      309    23.5%
8      26     2.0%
6      2      0.2%
2      1      0.1%
1      1      0.1%

Minimum 5 values
Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      309    23.5%
4      482    36.7%
5      494    37.6%

Maximum 5 values
Value  Count  Frequency (%)
8      26     2.0%
6      2      0.2%
5      494    37.6%
4      482    36.7%
3      309    23.5%

lotarea
Real number (ℝ≥0)

SKEWED

Distinct: 1198
Distinct (%): 91.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43980.41749
Minimum: 1506
Maximum: 5048550
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.3 KiB

Quantile statistics

Minimum: 1506
5-th percentile: 4895.3
Q1: 10847
Median: 21275
Q3: 41693
95-th percentile: 136977.2
Maximum: 5048550
Range: 5047044
Interquartile range (IQR): 30846

Descriptive statistics

Standard deviation: 153041.8282
Coefficient of variation (CV): 3.479772066
Kurtosis: 872.7974641
Mean: 43980.41749
Median Absolute Deviation (MAD): 12534
Skewness: 27.08598631
Sum: 57834249
Variance: 2.342180119e+10
Monotonicity: Not monotonic
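
The kurtosis figures in this report appear to be excess kurtosis (0 for a normal distribution): df_index reports a negative value, which raw kurtosis can never take. A minimal sketch of the population form, m4 / m2^2 - 3, is shown below. The report's estimator may include a small-sample correction, and the sample data is hypothetical.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3

flat = list(range(1, 101))      # roughly uniform: light tails
heavy = [10] * 99 + [5000]      # hypothetical data with one extreme lot

print(excess_kurtosis(flat))    # negative
print(excess_kurtosis(heavy))   # large and positive
```

A single lot far larger than the rest, like the 5048550 maximum here, is enough to produce a very large value.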
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
9875                 8      0.6%
7406                 7      0.5%
10042                6      0.5%
7531                 5      0.4%
6025                 5      0.4%
12552                4      0.3%
5021                 4      0.3%
24100                4      0.3%
4938                 4      0.3%
7500                 4      0.3%
Other values (1188)  1264   96.1%

Minimum 5 values
Value  Count  Frequency (%)
1506   2      0.2%
1942   1      0.1%
2025   1      0.1%
2143   1      0.1%
2150   1      0.1%

Maximum 5 values
Value    Count  Frequency (%)
5048550  1      0.1%
833945   1      0.1%
746956   1      0.1%
659375   1      0.1%
622700   1      0.1%

bldgarea
Real number (ℝ≥0)

Distinct: 1298
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 460533.981
Minimum: 1344
Maximum: 13540113
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.3 KiB

Quantile statistics

Minimum: 1344
5-th percentile: 74242.1
Q1: 184080.5
Median: 323029
Q3: 541097.5
95-th percentile: 1263334.4
Maximum: 13540113
Range: 13538769
Interquartile range (IQR): 357017

Descriptive statistics

Standard deviation: 601261.4827
Coefficient of variation (CV): 1.305574632
Kurtosis: 199.8663707
Mean: 460533.981
Median Absolute Deviation (MAD): 163517
Skewness: 10.75492508
Sum: 605602185
Variance: 3.615153705e+11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
623806               3      0.2%
332608               3      0.2%
225000               2      0.2%
431000               2      0.2%
50648                2      0.2%
177000               2      0.2%
224400               2      0.2%
272334               2      0.2%
96420                2      0.2%
216247               2      0.2%
Other values (1288)  1293   98.3%

Minimum 5 values
Value  Count  Frequency (%)
1344   1      0.1%
3146   1      0.1%
3280   1      0.1%
12000  1      0.1%
23805  1      0.1%

Maximum 5 values
Value     Count  Frequency (%)
13540113  1      0.1%
8837500   1      0.1%
3693539   1      0.1%
3221237   1      0.1%
2907315   1      0.1%

numfloors
Real number (ℝ≥0)

Distinct: 62
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.33307985
Minimum: 20.5
Maximum: 104
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.3 KiB

Quantile statistics

Minimum: 20.5
5-th percentile: 21
Q1: 24
Median: 30
Q3: 38
95-th percentile: 54
Maximum: 104
Range: 83.5
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 11.36545646
Coefficient of variation (CV): 0.3515117184
Kurtosis: 3.653381985
Mean: 32.33307985
Median Absolute Deviation (MAD): 7
Skewness: 1.625893713
Sum: 42518
Variance: 129.1736005
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
21                 150    11.4%
22                 89     6.8%
23                 78     5.9%
24                 72     5.5%
25                 70     5.3%
26                 69     5.2%
32                 55     4.2%
30                 52     4.0%
31                 47     3.6%
27                 46     3.5%
Other values (52)  587    44.6%

Minimum 5 values
Value  Count  Frequency (%)
20.5   3      0.2%
21     150    11.4%
22     89     6.8%
22.5   1      0.1%
23     78     5.9%

Maximum 5 values
Value  Count  Frequency (%)
104    1      0.1%
90     1      0.1%
88     2      0.2%
78     1      0.1%
77     1      0.1%

unitstotal
Real number (ℝ≥0)

Distinct: 495
Distinct (%): 37.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 218.8486692
Minimum: 0
Maximum: 10948
Zeros: 9
Zeros (%): 0.7%
Memory size: 10.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 36
Median: 139
Q3: 306
95-th percentile: 653.2
Maximum: 10948
Range: 10948
Interquartile range (IQR): 270

Descriptive statistics

Standard deviation: 387.4176722
Coefficient of variation (CV): 1.770253727
Kurtosis: 450.1455406
Mean: 218.8486692
Median Absolute Deviation (MAD): 118
Skewness: 16.94744441
Sum: 287786
Variance: 150092.4527
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
1                   111    8.4%
2                   46     3.5%
3                   16     1.2%
184                 13     1.0%
4                   13     1.0%
40                  10     0.8%
17                  9      0.7%
29                  9      0.7%
0                   9      0.7%
65                  8      0.6%
Other values (485)  1071   81.4%

Minimum 5 values
Value  Count  Frequency (%)
0      9      0.7%
1      111    8.4%
2      46     3.5%
3      16     1.2%
4      13     1.0%

Maximum 5 values
Value  Count  Frequency (%)
10948  1      0.1%
3027   1      0.1%
1706   1      0.1%
1615   1      0.1%
1604   1      0.1%

yearbuilt
Real number (ℝ≥0)

Distinct: 119
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1972.610646
Minimum: 0
Maximum: 2020
Zeros: 3
Zeros (%): 0.2%
Memory size: 10.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1920.7
Q1: 1962
Median: 1980
Q3: 2006
95-th percentile: 2018
Maximum: 2020
Range: 2020
Interquartile range (IQR): 44

Descriptive statistics

Standard deviation: 99.5519592
Coefficient of variation (CV): 0.05046711037
Kurtosis: 350.5855389
Mean: 1972.610646
Median Absolute Deviation (MAD): 23
Skewness: -17.79287381
Sum: 2593983
Variance: 9910.592581
Monotonicity: Not monotonic
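
The three zeros in yearbuilt pull the mean down and account for the strongly negative skewness. Whether 0 encodes "year unknown" is an assumption; the report does not say. Under that assumption, the mean of the non-zero years can be recovered from the reported sum alone, as this sketch shows (variable names are illustrative):

```python
# Values copied from the yearbuilt statistics above.
total_sum = 2593983
n_rows = 1315
n_zeros = 3

mean_all = total_sum / n_rows
# Zeros add nothing to the sum, so dropping them only changes the divisor.
mean_nonzero = total_sum / (n_rows - n_zeros)

print(round(mean_all, 2))      # 1972.61, matching the reported mean
print(round(mean_nonzero, 2))  # 1977.12
```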
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
2015                41     3.1%
1963                38     2.9%
2018                34     2.6%
1930                33     2.5%
1987                33     2.5%
1964                32     2.4%
2006                31     2.4%
2016                30     2.3%
1929                28     2.1%
1986                28     2.1%
Other values (109)  987    75.1%

Minimum 5 values
Value  Count  Frequency (%)
0      3      0.2%
1883   1      0.1%
1885   1      0.1%
1895   1      0.1%
1896   1      0.1%

Maximum 5 values
Value  Count  Frequency (%)
2020   22     1.7%
2019   27     2.1%
2018   34     2.6%
2017   25     1.9%
2016   30     2.3%

Interactions

[121 interaction plots omitted: one per pairing of the 11 numeric variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not change the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
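
The calculation described above can be sketched in a few lines of plain Python. The function name and sample data are illustrative; this is not the report's own implementation.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)

xs = [1, 2, 3, 4, 5]
print(pearson_r(xs, [2 * x + 10 for x in xs]))   # exact positive linear relation
print(pearson_r(xs, [-3 * x for x in xs]))       # exact negative linear relation
```

Shifting or rescaling either variable leaves the magnitude unchanged, which is why the two calls return values of magnitude 1 despite different slopes and offsets.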

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
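
A minimal sketch of that rank-based calculation, assuming tied values receive the average of their ranks (a common convention; the report does not state which one it uses). Function names and data are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation computed on the ranks of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

xs = [1, 2, 3, 4, 5]
print(spearman_rho(xs, [x ** 3 for x in xs]))  # monotonic but nonlinear
```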

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
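
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows; note that libraries such as SciPy and pandas default to tau-b, which adjusts for ties, so figures computed there can differ on tied data. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

print(kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0
print(kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))  # one discordant pair out of six
```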

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Two missing-value plots omitted; the dataset has no missing cells.]

Sample

First rows

      df_index  borough  block  schooldist  council  zipcode  landuse  lotarea   bldgarea   numfloors  unitstotal  yearbuilt
0     658320    MN       1265   2.0         4.0      10020.0  5.0      107766.0  2117061.0  70.0       109.0       1937.0
1     201620    MN       1299   2.0         4.0      10017.0  5.0      17573.0   629323.0   43.0       17.0        1982.0
2     156742    MN       1308   2.0         4.0      10022.0  5.0      81325.0   1526121.0  39.0       2.0         1969.0
3     346605    BK       3077   14.0        34.0     11206.0  3.0      257500.0  751412.0   21.0       772.0       1965.0
4     270630    MN       1536   2.0         5.0      10128.0  3.0      153080.0  666393.0   42.0       648.0       1975.0
5     630292    MN       10     2.0         1.0      10004.0  5.0      15445.0   336025.0   24.0       97.0        1930.0
6     128222    MN       1505   2.0         4.0      10128.0  4.0      22102.0   302439.0   32.0       212.0       1984.0
7     164312    MN       699    2.0         3.0      10001.0  4.0      22219.0   143052.0   25.0       41.0        2014.0
8     269434    MN       1318   2.0         4.0      10017.0  4.0      7537.0    109822.0   35.0       8.0         2015.0
9     200453    MN       997    2.0         4.0      10036.0  5.0      16820.0   471985.0   48.0       1.0         1988.0

Last rows

      df_index  borough  block  schooldist  council  zipcode  landuse  lotarea   bldgarea   numfloors  unitstotal  yearbuilt
1305  642313    MN       1485   2.0         5.0      10021.0  8.0      39547.0   757439.0   24.0       1.0         2015.0
1306  200583    MN       1024   2.0         3.0      10019.0  5.0      23900.0   762619.0   35.0       1.0         1987.0
1307  153741    MN       1314   2.0         4.0      10016.0  8.0      19701.0   279254.0   25.0       24.0        2001.0
1308  608074    BX       2623   7.0         17.0     10455.0  3.0      166139.0  422400.0   22.0       471.0       1960.0
1309  746569    MN       1280   2.0         4.0      10017.0  5.0      57282.0   1028194.0  26.0       13.0        1919.0
1310  268057    MN       840    2.0         4.0      10018.0  5.0      4148.0    88551.0    34.0       173.0       2018.0
1311  274079    MN       2170   6.0         10.0     10040.0  3.0      96675.0   223200.0   21.0       205.0       1959.0
1312  680761    MN       861    2.0         4.0      10016.0  4.0      8400.0    175687.0   35.0       166.0       2008.0
1313  200392    MN       811    2.0         3.0      10018.0  5.0      19750.0   408511.0   22.0       88.0        1925.0
1314  227378    MN       1037   2.0         3.0      10036.0  5.0      3292.0    75902.0    29.0       1.0         2014.0